Implement insert_many<IntoIterator> for SmallVec #28

nipunn1313 · 2016-08-20T00:17:37Z

This doesn't work as is, but I wanted to put it up for feedback.

According to these docs:
https://doc.rust-lang.org/std/ptr/fn.copy_nonoverlapping.html

The copy_nonoverlapping function copies memory as intended, but the ownership of the values copied are not copied over to dest, so when the box is consumed, the SmallVec does not maintain ownership. Not sure how to do this with unsafe code.

This change is

SimonSapin · 2016-08-20T03:26:51Z

You can not take ownership of T values out of a &[T] slice. The & indicates that it’s a borrow. To keep using a slice, you need either to clone each value or require Copy.

Maybe it’s better to take an IntoIterator rather than a slice as it’s more general and avoids the ownership problem. (Users can choose to call .iter().cloned() if they’re starting with a slice.) But then you have to deal with size_hint being inexact or even wrong, and it kinda gets messy. See rust-lang/rust#32355

nipunn1313 · 2016-08-20T05:17:14Z

Thank you for the feedback! I think I have a much better understanding of why it wasn't working.

I intended on adding the insert_slice to leverage memcpy, so it makes a lot of sense to restrict insert_slice to arrays whose types impl Copy. Insert_many taking an into_iter would also work, but it wouldn't leverage the memcpy advantage, so it would be purely ergonomic. Perhaps there is room for both, but I started here with insert_slice.

nipunn1313 · 2016-08-23T18:16:10Z

I spent a little more time thinking about this and realize that leveraging memcpy is the perfect use of impl specialization. I think I will adapt this diff to implement "insert_many" using an IntoIterator and create a separate PR for using specialization to get a memcpy leveraged implementation.

https://github.com/rust-lang/rfcs/blob/master/text/1210-impl-specialization.md

SimonSapin · 2016-08-23T18:24:26Z

Unfortunately specialization is not stable yet and I’d like to keep this crate working on stable Rust.

Beyond that, there was also a very similar discussion in Rust’s standard library, about specializing Clone to memcpy where T: Copy. Some argued that doing so would be wrong since the Clone implementation is arbitrary and might have a different behavior. Some argued that such a Clone implementation would be buggy, and that std didn’t need to carter to it.

In the end, specialization was not used and &mut [T] mutable slices now have two methods: copy_from_slice which requires T: Copy and clone_from_slice which requires T: Clone.

nipunn1313 · 2016-08-23T18:29:15Z

Understood. I've adapted this diff to be completely iterator-compatible (take another look now) without using any memcpy type primitives or specialization. This diff should be safe to add as a backward compatible change in its current state, as long as we're willing to include an "insert_many" operation or something of the sort.

I'll create a new diff which will work around the lack of specialization by adding new methods (extend_slice and insert_slice) which take advantage of memcpy semantics for copy type slices.

SimonSapin · 2016-08-23T18:54:22Z

Is there a specific need for all this? What is it? I’d rather not add lots of API surface just because we can.

nipunn1313 · 2016-08-24T06:54:31Z

We were looking to use this here. There was a spot where we needed to insert many items into a smallvec (preferably efficiently). Currently, we've rolled our own "ElasticArray" with this functionality, but would prefer to use a more supported library like this.

https://github.com/ethcore/parity/pull/1957

Add benchmarks for push, insert, extend, and pushpop Want to add these benchmarks so that we can more effectively detect regressions/improvements from PR's like #28 and #29.  --- This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/rust-smallvec/31)

nipunn1313 · 2016-09-12T01:06:24Z

Here are the benchmarks as seen on my machine for this

Nipunns-MBP:rust-smallvec nipunn$ git checkout master ; cargo bench --features benchmarks bench ; git checkout insert_slice ; cargo bench --features benchmarks bench
Switched to branch 'master'
Your branch is up-to-date with 'origin/master'.
   Compiling smallvec v0.2.0 (file:///Users/nipunn/src/rust-smallvec)
    Finished release [optimized] target(s) in 2.55 secs
     Running target/release/deps/smallvec-a59689f47ae774cb

running 4 tests
test bench::bench_extend  ... bench:         152 ns/iter (+/- 18)
test bench::bench_insert  ... bench:       1,150 ns/iter (+/- 167)
test bench::bench_push    ... bench:         291 ns/iter (+/- 127)
test bench::bench_pushpop ... bench:         356 ns/iter (+/- 70)

test result: ok. 0 passed; 0 failed; 0 ignored; 4 measured

Switched to branch 'insert_slice'
   Compiling smallvec v0.2.0 (file:///Users/nipunn/src/rust-smallvec)
    Finished release [optimized] target(s) in 2.74 secs
     Running target/release/deps/smallvec-a59689f47ae774cb

running 5 tests
test bench::bench_extend      ... bench:         166 ns/iter (+/- 25)
test bench::bench_insert      ... bench:       1,150 ns/iter (+/- 184)
test bench::bench_insert_many ... bench:         500 ns/iter (+/- 139)
test bench::bench_push        ... bench:         276 ns/iter (+/- 39)
test bench::bench_pushpop     ... bench:         366 ns/iter (+/- 124)

test result: ok. 0 passed; 0 failed; 0 ignored; 5 measured

Looks like insert_many improves upon insert as expected.
Both insert and insert_many benchmarks are inserting 100 items to the beginning of an empty smallvec<16>. Both require a resize, but insert_many will be linear while the loop of insert will be quadratic after the resize.

nipunn1313 · 2016-09-12T01:07:33Z

This is definitely a micro-optimization sensitive piece of code, so each little change can affect benchmarks. Benchmark tests were invaluable in making sure this didn't regress!

nipunn1313 · 2016-09-12T05:14:22Z

I made some minor tweaks and noticed it changed the benchmark output for bench_insert (even though we didn't touch that function).

I disassembled the outputs and have come to the conclusion that different inlining is creating different performance characteristics. On master, the insert() function is inlined into the benchmark, but after my change, the compiler chose to make insert() a function call.

This is all pointing to somewhat bogus benchmarking because external callers of insert() shouldn't be seeing this cross-crate inlining. I think in order to get a true, sense the benchmarks need to be in a separate crate, so I will give that a try (maybe using the tests/ directory?).

Nipunns-MBP:rust-smallvec nipunn$ git checkout master ; cargo bench --features benchmarks bench ; git checkout insert_slice ; cargo bench --features benchmarks bench
Switched to branch 'master'
Your branch is up-to-date with 'origin/master'.
   Compiling smallvec v0.2.0 (file:///Users/nipunn/src/rust-smallvec)
    Finished release [optimized] target(s) in 2.46 secs
     Running target/release/deps/smallvec-a59689f47ae774cb

running 4 tests
test bench::bench_extend  ... bench:         153 ns/iter (+/- 5)
test bench::bench_insert  ... bench:       1,144 ns/iter (+/- 147)
test bench::bench_push    ... bench:         280 ns/iter (+/- 43)
test bench::bench_pushpop ... bench:         354 ns/iter (+/- 14)

test result: ok. 0 passed; 0 failed; 0 ignored; 4 measured

Switched to branch 'insert_slice'
   Compiling smallvec v0.2.0 (file:///Users/nipunn/src/rust-smallvec)
git     Finished release [optimized] target(s) in 2.63 secs
     Running target/release/deps/smallvec-a59689f47ae774cb

running 5 tests
test bench::bench_extend      ... pubench:         157 ns/iter (+/- 16)
test bench::bench_insert      ... sh bench:       1,428 ns/iter (+/- 257)
test bench::bench_insert_many ... obench:         692 ns/iter (+/- 128)
test bench::bench_push        ... bench:         276 ns/iter (+/- 34)
test bench::bench_pushpop     ... bench:         354 ns/iter (+/- 111)

test result: ok. 0 passed; 0 failed; 0 ignored; 5 measured

In any case, I think this iteration of the code is pretty good, but I'd like to improve the benchmarks so we don't see this regression due to inlining.

nipunn1313 · 2016-09-12T07:53:46Z

Going to wait for #32 to land first since it sets up the noinline bench wrapper to get more consistent results.

bors-servo · 2016-09-14T19:25:27Z

☔ The latest upstream changes (presumably #32) made this pull request unmergeable. Please resolve the merge conflicts.

Add benchmark for insert_many

nipunn1313 · 2016-09-21T07:37:15Z

Some improvements

Rebased onto the noinline benchmark suite
Added tests for cases with bad size hints
Optimized copying to eliminate n^2 behavior of many-inserts
Added a benchmark for insert_many 100 elements followed by insert_many 100 elements again. Tests for O(n) performance rather than O(n^2)

Nipunns-MBP:rust-smallvec nipunn$ git checkout master ; cargo bench bench ; git checkout insert_slice ; cargo bench bench
Switched to branch 'master'
Your branch is up-to-date with 'origin/master'.
   Compiling smallvec v0.2.0 (file:///Users/nipunn/src/rust-smallvec)
    Finished release [optimized] target(s) in 2.32 secs
     Running target/release/bench-ef222b4531a86f9a

running 4 tests
test bench_extend  ... bench:         146 ns/iter (+/- 23)
test bench_insert  ... bench:       1,374 ns/iter (+/- 123)
test bench_push    ... bench:         439 ns/iter (+/- 128)
test bench_pushpop ... bench:         359 ns/iter (+/- 115)

test result: ok. 0 passed; 0 failed; 0 ignored; 4 measured

     Running target/release/deps/smallvec-a59689f47ae774cb

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured

Switched to branch 'insert_slice'
   Compiling smallvec v0.2.0 (file:///Users/nipunn/src/rust-smallvec)
    Finished release [optimized] target(s) in 2.62 secs
     Running target/release/bench-ef222b4531a86f9a

running 5 tests
test bench_extend      ... bench:         153 ns/iter (+/- 30)
test bench_insert      ... bench:       1,433 ns/iter (+/- 610)
test bench_insert_many ... bench:         497 ns/iter (+/- 180)
test bench_push        ... bench:         429 ns/iter (+/- 155)
test bench_pushpop     ... bench:         358 ns/iter (+/- 171)

test result: ok. 0 passed; 0 failed; 0 ignored; 5 measured

     Running target/release/deps/smallvec-a59689f47ae774cb

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured

I feel good about this iteration for handling the insert-many use case.

jdm

Sorry for the delay in reviewing! I wanted to take the time to reason through this implementation carefully, and I'm glad I did!

jdm · 2016-09-23T18:10:04Z

lib.rs

+        unsafe {
+            let ptr = self.as_mut_ptr().offset(index as isize);
+            let old_len = self.len;
+            ptr::copy(ptr, ptr.offset(lower_size_bound as isize), old_len - index);


We need to assert index <= old_len or this is unsafe.

jdm · 2016-09-23T18:54:46Z

lib.rs

+            }
+            let num_added = self.len - old_len;
+            if num_added < lower_size_bound {
+                // Iterator provided less elements than the hint


s/less/fewer/

jdm · 2016-09-23T19:04:16Z

lib.rs

+        self.reserve(lower_size_bound);
+
+        unsafe {
+            let ptr = self.as_mut_ptr().offset(index as isize);


All of these casts to isize when offsetting are unsafe, since they could become negative numbers and cause us to write outside of the bounds of the array. We should either use to_isize.unwrap() and panic if that occurs, or use a conversion strategy that yields a value that will gives us worse performance but correct behaviour.

nipunn1313 · 2016-10-03T20:39:30Z

Thank you for the careful review! I'm taking another pass at the change with careful attention to integer overflow and casting cases.

nipunn1313 · 2016-10-03T20:57:42Z

At a quick readthrough, it looks like push, pop, truncate, remove, insert all suffer from the same issue if len > isize::MAX.

It looks like one might be able to manipulate that situation in a call to insert() when len == isize::MAX (then the len would get set to size::MAX + 1).

In any case, here is an updated revision with some bounds checks to insert_many.

jdm · 2017-01-19T23:53:01Z

I'm terribly sorry that I dropped the ball here!
@bors-servo: r+

bors-servo · 2017-01-19T23:53:02Z

📌 Commit 1c254ff has been approved by jdm

bors-servo · 2017-01-19T23:53:06Z

⌛ Testing commit 1c254ff with merge 03c016c...

Implement insert_many<IntoIterator> for SmallVec This doesn't work as is, but I wanted to put it up for feedback. According to these docs: https://doc.rust-lang.org/std/ptr/fn.copy_nonoverlapping.html The copy_nonoverlapping function copies memory as intended, but the ownership of the values copied are not copied over to dest, so when the box is consumed, the SmallVec does not maintain ownership. Not sure how to do this with unsafe code.  --- This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/rust-smallvec/28)

bors-servo · 2017-01-19T23:53:59Z

☀️ Test successful - status-travis

Implement extend_from_slice and insert_from_slice with memmove optimization Implement the performance optimized versions of insert_many and extend (from #28). These methods use memmove rather than a looped insert. If we had function specialization, we could implement these without exposing new methods. Up to the maintainers whether we want to support these new methods.  --- This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/rust-smallvec/29)

nipunn1313 force-pushed the insert_slice branch from 005a827 to 3f7606f Compare August 20, 2016 05:14

nipunn1313 changed the title ~~Try to implement insert_slice~~ Implement insert_slice for SmallVec Aug 20, 2016

nipunn1313 mentioned this pull request Aug 20, 2016

Migrate rlp from ElasticArray to SmallVec openethereum/parity-ethereum#1957

Closed

nipunn1313 force-pushed the insert_slice branch from 3f7606f to 85a4a07 Compare August 23, 2016 18:25

nipunn1313 changed the title ~~Implement insert_slice for SmallVec~~ Implement insert_many<IntoIterator> for SmallVec Aug 23, 2016

nipunn1313 mentioned this pull request Aug 23, 2016

Implement extend_from_slice and insert_from_slice with memmove optimization #29

Merged

nipunn1313 mentioned this pull request Aug 27, 2016

Add benchmarks for push, insert, extend, and pushpop #31

Merged

nipunn1313 force-pushed the insert_slice branch from 85a4a07 to e7246e9 Compare September 12, 2016 00:30

nipunn1313 force-pushed the insert_slice branch from d7175ae to a9fc702 Compare September 12, 2016 05:10

Implement insert_slice for SmallVec

15c41c8

Add benchmark for insert_many

nipunn1313 force-pushed the insert_slice branch from a9fc702 to 15c41c8 Compare September 21, 2016 07:34

jdm requested changes Sep 23, 2016

View reviewed changes

Assert to check for overflow

1c254ff

jdm mentioned this pull request Oct 3, 2016

Prevent memory unsafety when array offset values are >isize::MAX #34

Open

bors-servo merged commit 1c254ff into servo:master Jan 19, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement insert_many<IntoIterator> for SmallVec #28

Implement insert_many<IntoIterator> for SmallVec #28

nipunn1313 commented Aug 20, 2016 •

edited by larsbergstrom

Loading

SimonSapin commented Aug 20, 2016

nipunn1313 commented Aug 20, 2016

nipunn1313 commented Aug 23, 2016

SimonSapin commented Aug 23, 2016

nipunn1313 commented Aug 23, 2016

SimonSapin commented Aug 23, 2016

nipunn1313 commented Aug 24, 2016

nipunn1313 commented Sep 12, 2016

nipunn1313 commented Sep 12, 2016

nipunn1313 commented Sep 12, 2016

nipunn1313 commented Sep 12, 2016

bors-servo commented Sep 14, 2016

nipunn1313 commented Sep 21, 2016 •

edited

Loading

jdm left a comment

jdm Sep 23, 2016 •

edited

Loading

jdm Sep 23, 2016

jdm Sep 23, 2016

nipunn1313 commented Oct 3, 2016

nipunn1313 commented Oct 3, 2016

jdm commented Jan 19, 2017

bors-servo commented Jan 19, 2017

bors-servo commented Jan 19, 2017

bors-servo commented Jan 19, 2017

Implement insert_many<IntoIterator> for SmallVec #28

Implement insert_many<IntoIterator> for SmallVec #28

Conversation

nipunn1313 commented Aug 20, 2016 • edited by larsbergstrom Loading

SimonSapin commented Aug 20, 2016

nipunn1313 commented Aug 20, 2016

nipunn1313 commented Aug 23, 2016

SimonSapin commented Aug 23, 2016

nipunn1313 commented Aug 23, 2016

SimonSapin commented Aug 23, 2016

nipunn1313 commented Aug 24, 2016

nipunn1313 commented Sep 12, 2016

nipunn1313 commented Sep 12, 2016

nipunn1313 commented Sep 12, 2016

nipunn1313 commented Sep 12, 2016

bors-servo commented Sep 14, 2016

nipunn1313 commented Sep 21, 2016 • edited Loading

jdm left a comment

Choose a reason for hiding this comment

jdm Sep 23, 2016 • edited Loading

Choose a reason for hiding this comment

jdm Sep 23, 2016

Choose a reason for hiding this comment

jdm Sep 23, 2016

Choose a reason for hiding this comment

nipunn1313 commented Oct 3, 2016

nipunn1313 commented Oct 3, 2016

jdm commented Jan 19, 2017

bors-servo commented Jan 19, 2017

bors-servo commented Jan 19, 2017

bors-servo commented Jan 19, 2017

nipunn1313 commented Aug 20, 2016 •

edited by larsbergstrom

Loading

nipunn1313 commented Sep 21, 2016 •

edited

Loading

jdm Sep 23, 2016 •

edited

Loading